ABSTRACT




A  structural  complexity  measure  that  is  useful  for 
generating  morphological  feature  detectors  is  described. 
The  complexity  measure  is  evaluated  using  two-class 
handwritten  character  recognition  experiments.  Results 
suggest  there  is  a  complexity  band  that  can  be  used  to  aid 
in  the  search  for  generalizable  feature  detectors. 


INTRODUCTION 

The  work  described  in  this  paper  is  part  of  a  program  to 
investigate  ways  of  using  adaptive  search  techniques  to 
automate  the  design  of  pattern  recognition  systems.  We 
use  stochastic  search  techniques  to  generate  or  synthesize 
morphological  feature  detectors  based  on  morphological 
erosion  operators  and  hit-or-miss  operators  [1].  These 
operators  utilize  structuring  elements  to  probe  input 
images  for  geometrical  and  topological  features.  A 
structuring  element  is  a  collection  of  pixel  specifications 
that  serves  as  a  template  scanned  over  the  entire  input 
image.  A  hit-or-miss  structuring  element  is  one  which 
contains  both  foreground  and  background  pixel 
specifications.  The  operators  systematically  mark  the 
location  for  each  correspondence  between  the  template 
and  the  input  image,  thus  creating  secondary  output 
images.  When  the  structuring  element  is  a  small  local 
cluster  of  foreground  pixels,  the  erosion  operator  erodes  a 
layer  of  boundary  points  from  the  foreground-defined 
objects. The secondary output image in our experiments
indicates the presence or absence of positive
correspondence somewhere in the input image. A
structuring element, when used in an erosion operation,
can be interpreted as a binary feature detector.
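
The template-scanning idea described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `detects`, the offset-list representation of a structuring element, and the choice to treat pixels outside the image as background are all assumptions made for this example.

```python
def detects(image, foreground, background=()):
    """Return True if the hit-or-miss template matches anywhere in `image`.

    image      -- list of equal-length rows of 0/1 values
    foreground -- (row, col) offsets that must land on 1-pixels
    background -- (row, col) offsets that must land on 0-pixels
    Pixels outside the image are assumed to be background (0).
    """
    rows, cols = len(image), len(image[0])

    def pixel(r, c):
        inside = 0 <= r < rows and 0 <= c < cols
        return image[r][c] if inside else 0

    # Let the template origin range slightly past the borders so that
    # templates whose origin is not itself a foreground pixel still fit.
    offsets = list(foreground) + list(background)
    reach = max((max(abs(dr), abs(dc)) for dr, dc in offsets), default=0)

    for r in range(-reach, rows + reach):
        for c in range(-reach, cols + reach):
            hit = all(pixel(r + dr, c + dc) == 1 for dr, dc in foreground)
            miss = all(pixel(r + dr, c + dc) == 0 for dr, dc in background)
            if hit and miss:
                return True
    return False


stroke = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
vertical_pair = [(0, 0), (1, 0)]      # two stacked foreground pixels
horizontal_pair = [(0, 0), (0, 1)]    # two side-by-side foreground pixels

print(detects(stroke, vertical_pair))                    # erosion-style probe
print(detects(stroke, horizontal_pair))
print(detects(stroke, [(0, 0)], background=[(-1, 0)]))   # top end of a stroke
```

With an empty `background` list the probe behaves like an erosion followed by a test for any surviving pixel. Adding background offsets turns it into a hit-or-miss probe. On the vertical stroke, the first and third probes respond and the second does not.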

In  recent  work  [2,  3],  we  have  investigated  resource 
allocation  strategies  that  improve  the  efficiency  of  the 
search  technique.  These  strategies  are  generic  because 
they  depend  on  detector  response  to  a  training  set  of 
images  and  not  on  particular  attributes  of  the  detector 
producing  the  responses.  Often  the  various  attributes 
which  may  affect  performance  of  the  structuring  elements 
are  loosely  referred  to  as  complexity.  In  this  paper  we 
address  the  question  of  how  to  assess  a  complexity 
measure.  Our  approach  is  to  define  a  specific  complexity 
measure  and  to  investigate  its  correlation  with 
performance  measures.  Factoring  this  type  of  information 
into  search  strategies  offers  the  promise  of  more  efficient 
algorithms  for  designing  structuring  elements.  Two  other 
basic  questions  are  addressed  below:  What  are  the 
optimal  performance  levels  for  single  detectors?  How  does 
the  performance  generalize  when  a  detector  is  confronted 
with  new  samples  of  handwritten  letters? 

COMPLEXITY MEASURE


A complexity measure for structuring elements provides a
single parameter to rank the elements based on
geometrical characteristics of the two-dimensional
distribution of pixels. Structuring elements may have
different numbers of pixels and sizes within certain
pre-defined limits. For example, the structuring elements
used in the experiments described in this paper fit into a
31 x 31 matrix and contain fewer than 32 pixels.

We  define  a  complexity  measure  (C)  to  be  linearly 
dependent  on  the  number  of  pixels  in  the  structuring 
element  (N)  and  on  a  characteristic  dimension  (R).  We  let 
C = N • R • f, where f is a function of the geometrical
distribution of the pixels. In defining f, it is convenient to
use polar coordinates (r, v) with the origin at the center of mass of the
structuring element. In this case, we let the
characteristic dimension R equal the maximum radial
coordinate.

The  distribution  function  is  taken  to  be  the  sum  of  two 
terms  which  characterize  the  angular  and  radial 
dispersions  of  pixels  in  the  structuring  element.  The 
angular dispersion is computed with respect to eight
angular sectors. The radial coordinates weight the pixel
occupancy in each sector so that one derives a total
weight (W_i) for each sector (S_i, i = 1, ..., 8):

$$W_i = \sum_{r_k \in S_i} r_k \qquad (i = 1, 2, 3, \ldots, 8)$$


The weights W_i are used in turn to compute an angular
entropy (E_v) defined by

$$E_v = -\sum_{i=1}^{8} \frac{W_i}{T} \log_2 \frac{W_i}{T}, \qquad \text{where } T = \sum_{i=1}^{8} W_i,$$

and the maximum value for E_v is 3, which is obtained by
substituting W_i = T/8 in the equation for E_v. The radial
distribution  is  characterized  by  the  standard  deviation  (S) 
of  the  radial  coordinates.  In  this  paper  we  investigate  the 
following definition:

$$f = \frac{E_v}{6} + \frac{S}{R},$$

where we use the normalization factors of 6 and R to keep
the values of f between 0 and 1.
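
As a rough sketch of how the measure might be computed, the Python below follows the description above: sector weights are sums of radial coordinates, the angular entropy uses base-2 logarithms (so eight equally weighted sectors give the stated maximum of 3), and f is taken to be the entropy divided by 6 plus the radial standard deviation divided by R. Several details are assumptions of this example rather than statements from the paper: the sectors are 45 degrees wide and start at angle 0, the standard deviation is the population form (which keeps S/R at or below 0.5), empty sectors contribute nothing to the entropy, and foreground and background pixel specifications are treated alike. The function names are hypothetical.

```python
import math


def distribution_factor(pixels, sectors=8):
    """Return (R, f) for a list of (x, y) pixel coordinates."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n          # center of mass
    cy = sum(y for _, y in pixels) / n
    radii = [math.hypot(x - cx, y - cy) for x, y in pixels]
    angles = [math.atan2(y - cy, x - cx) % (2 * math.pi) for x, y in pixels]

    r_max = max(radii)
    if r_max == 0:                               # all pixels coincide
        return 0.0, 0.0

    # Sector weight W_i: sum of the radial coordinates falling in sector i.
    width = 2 * math.pi / sectors
    weights = [0.0] * sectors
    for r, a in zip(radii, angles):
        weights[min(int(a / width), sectors - 1)] += r

    total = sum(weights)
    entropy = -sum((w / total) * math.log2(w / total)
                   for w in weights if w > 0)

    # Population standard deviation of the radial coordinates.
    mean_r = sum(radii) / n
    spread = math.sqrt(sum((r - mean_r) ** 2 for r in radii) / n)

    return r_max, entropy / 6 + spread / r_max


def complexity(pixels):
    """Return C = N * R * f."""
    r_max, f = distribution_factor(pixels)
    return len(pixels) * r_max * f


# Two pixels two units apart: N = 2, R = 1, entropy = 1, S = 0, so C = 1/3.
print(complexity([(0, 0), (2, 0)]))
```

Because the coordinates are measured from the center of mass, the value does not change when the whole structuring element is translated.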

Eight  sample  hit-or-miss  structuring  elements  and  their 
corresponding  complexity  measures  are  shown  in  Figure 
1.  The  structuring  elements  are  arranged  in  order  of 
increasing complexity. As the displacements between the
pixels increase, the characteristic dimension R increases. Also, the
entropy increases as the pixels become more evenly
distributed in the structuring element.


PERFORMANCE  MEASURES 

The  basic  recognition  rates  used  to  measure  performance 
are: 

α = % of targets correctly identified as targets

β = % of non-targets correctly identified as non-targets

γ = (α + β) / 2.

In  our  target  recognition  experiments,  we  use  25 
non-targets  to  every  target.  There  are  eight  target  images 
in  the  training  set  and  200  non-target  characters.  In 
general,  we  want  target  recognizers  to  have  the  ability  to 
recognize  the  target  before  we  take  interest  in  their 
ability  to  discriminate  non-targets.  The  search  algorithms 
used  in  our  experiments  allow  up  to  two  target 
recognition  errors;  hence,  the  three  allowed  values  for  the 
training set α are 1.0, 7/8, and 6/8. Within these three
classes, the relative performance is determined by the
value of β. γ, which is the average of α and β, is used as
an overall performance measure. γ is defined so that
correct identification of targets is given more weight than
the correct identification of non-targets: with 25
non-targets per target, one target error lowers γ as much
as 25 non-target errors.
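
A small Python sketch of these three rates follows. It assumes, for illustration only, that a detector's output on each image is a single boolean (True when it responds), that responding to a target is correct, and that staying silent on a non-target is correct. The function name `recognition_rates` is hypothetical.

```python
def recognition_rates(target_responses, nontarget_responses):
    """Return (alpha, beta, gamma) from per-image detector responses.

    A response of True means the detector fired on that image. Firing on a
    target and staying silent on a non-target both count as correct.
    """
    alpha = sum(target_responses) / len(target_responses)
    beta = sum(not r for r in nontarget_responses) / len(nontarget_responses)
    gamma = (alpha + beta) / 2
    return alpha, beta, gamma


# Training-set sizes from the text: 8 targets and 200 non-targets.
one_missed_target = recognition_rates([True] * 7 + [False], [False] * 200)
many_false_alarms = recognition_rates([True] * 8, [True] * 25 + [False] * 175)

print(one_missed_target)    # alpha = 7/8, beta = 1
print(many_false_alarms)    # alpha = 1, beta = 175/200
```

Both cases give the same combined rate, which shows how averaging the two rates weights each target as heavily as 25 non-targets.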


EXPERIMENT 

In  this  experiment,  we  have  limited  our  search  to 
finding  a  single  detector  that  can  distinguish  a  target 
letter  from  the  remaining  letters  in  the  alphabet.  We 
scan handwritten letters of more or less the same size into
a 32 x 32 binary matrix (Figure 2). These images contain
a large amount of distortion and some differences in scale.
There  is  no  need  to  center  the  images  since  the 
morphological  operators  are  shift  invariant.  The  objective 
of  these  investigations  is  to  optimize  the  performance  of  a 
single  extended  hit-or-miss  (erosion)  structuring  element. 
A  large  population  of  structuring  elements  is  generated  by 
a  stochastic  search  technique  driven  by  performance 
measures  described  above. 

Figure 1. Sample Structuring Elements and Related Complexity Measure.

Figure 2. Sample Handwritten Training and Test Sets.

Figure 3. Target letter A. All structuring elements. Performance for all structuring
elements generated in this experiment. The horizontal and vertical axes record
complexity and performance, respectively.

In Figure 3, α, β, and γ plots of performance as a
function of complexity are shown for the target letter A.
The α graph plots the target recognition rate on training
images against structuring element complexity.
Recall that only three levels of performance are acceptable:
8/8, 7/8, and 6/8. Notice the slight increase in the range of
complexity as the performance level decreases. This effect
is even more noticeable in the α' plot, which shows the
complexity-performance interaction of the same
structuring elements applied to an independent set of test
images. α' is not constrained, so all levels of performance
(0/8 through 8/8) appear. These plots indicate that the training and
test performances on target images are inversely
related to complexity. The β and β' plots show the
relationship  between  complexity  and  non-target 
recognition  for  training  and  test  images.  There  is  no 
constraint  on  non-target  recognition  rates  so  performance 
levels  range  between  0  (0/200)  and  1.0  (200/200).  Visual 
inspection of β and β' shows that structuring elements
with complexities above 50 have the potential to achieve
good levels of performance with respect to non-target
recognition. Non-target recognition appears to
increase with the complexity of a structuring element.
The γ-γ' plots show the weighted performance of the
structuring elements applied to training and test sets.
Typically, the combined recognition rate γ ranges between
0.5 and 1.0. When the structuring elements are applied
to  the  test  set,  the  range  of  performance  is  shifted  down  to 
0.35-0.85.  This  downward  shift  is  not  unexpected  and 
indicates  that  the  structuring  elements  are  not  capable  of 
recognizing  some  of  the  variations  present  in  the  test  set. 
The  more  interesting  phenomenon  is  the  behavior  of 
performance  as  it  relates  to  different  levels  of  complexity. 
Since the complexity of each individual structuring
element is the same in γ and γ', the number of structuring
elements with complexity greater than 100 is the same in
both plots. The downward shift in performance between γ
and γ' is less dramatic for structuring elements with
complexity less than 100. This behavior suggests that
complexity influences the ability of structuring elements
to generalize.




The γ plot shown in Figure 3 incorporates a restricted
α (≥ 0.75) and unrestricted β. In Figure 4, γ is plotted
with the restriction that both α and β on the training set
must be greater than or equal to 0.75. Using the same
structuring elements shown in Figure 3, the new
restriction on β eliminates individuals with performance
below 0.75 on the training set but does not significantly
alter the behavior of γ'. Only the structuring elements
that produce training and test set performances above
0.75 are shown in the second pair of γ-γ' plots (see Figure
4). These results clearly reveal the location of a bounded
complexity band (complexity = 10 to 125) that contains the
optimal detectors.
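
The filtering step can be sketched as follows. This is a simplification for illustration: the paper reads the band from the plots, whereas the hypothetical `complexity_band` function below simply reports the smallest and largest complexity among the candidates that pass the threshold. The `Candidate` record and the numbers in `population` are made up for the example and are not results from the experiments.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    complexity: float
    alpha_train: float
    beta_train: float
    alpha_test: float
    beta_test: float


def complexity_band(candidates, threshold=0.75, use_test=True):
    """Return (low, high) complexity of candidates meeting the threshold.

    With use_test=False only the training-set rates are checked.
    Returns None when no candidate survives the filter.
    """
    kept = []
    for c in candidates:
        rates = [c.alpha_train, c.beta_train]
        if use_test:
            rates += [c.alpha_test, c.beta_test]
        if all(rate >= threshold for rate in rates):
            kept.append(c.complexity)
    return (min(kept), max(kept)) if kept else None


population = [  # made-up values, for illustration only
    Candidate(8.0, 1.0, 0.40, 0.875, 0.35),
    Candidate(30.0, 1.0, 0.80, 0.875, 0.78),
    Candidate(90.0, 0.875, 0.90, 0.75, 0.85),
    Candidate(160.0, 0.75, 0.97, 0.375, 0.95),
]

print(complexity_band(population, use_test=False))  # training-set filter only
print(complexity_band(population))                  # training and test filter
```

In this toy population the band narrows once test-set performance is also required, because the most complex candidate fails to recognize enough test targets.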

Figure 5 shows complexity bands (γ-γ') for the letters
H, I, and Y. For these letters, the stochastic search
process is able to generate structuring elements with
recognition rates of 0.85 or higher for the training images and
0.75 or higher for test images. The position and size of the
complexity band vary for different characters. The
complexity bands for the letters H, I, and Y are
approximately (25, 175), (25, 225), and (25, 275),
respectively.


SUMMARY 

Extended  structuring  elements  can  be  readily 
customized  to  discriminate  target  letters  from  non-target 
letters.  There  are  a  few  letters  that  are  difficult  to 
customize,  such  as  the  letter  T.  We  have  examined  the 
relationship  between  complexity  measures  and 
recognition  rates  for  letters  of  the  alphabet  and  have 
presented  typical  results  which  characterize  these 
experiments. 

Opposing  forces  appear  to  be  operating  during  the 
structuring  element  generation  process.  Target 
recognition  rates  on  training  and  test  sets  improve  as 
structuring  element  complexity  decreases  while 
non-target  recognition  rates  improve  as  structuring 
element  complexity  increases.  The  complexity  range 
overlap  that  produces  good  recognition  rates  for  target 
and  non-target  images  defines  a  small  complexity  band 
that  may  be  useful  in  the  construction  of  structuring 
elements  capable  of  responding  to  invariant  features. 

Figure 4. Target letter A. Filtered structuring elements. The horizontal and vertical
axes record complexity and performance, respectively. Better performers are filtered
from the total population using the "good" criteria, i.e., α ≥ 0.75 and β ≥ 0.75. The top
pair are filtered by requiring "good" performance with the training set. The bottom
pair are filtered by requiring "good" performance with both the training and test sets.

Figure 5. γ performance of letters I, H, and Y. The filtered γ performance for
structuring elements applied to the training and test sets is shown.

The complexity measure, C, defined for this
experiment is not unique; however, it does depend on
certain properties characteristic of a structuring element's
complexity. The general relationships presented in this
paper  between  C  and  performance  can  be  used  to  improve 
the  efficiency  of  search  strategies  used  to  automatically 
generate  structuring  elements.  Because  the  relationship 
between  complexity  and  performance  measures  is 
consistent  and  understandable,  our  definition  of  C 
provides  a  baseline  for  further  study  of  these  issues. 

ACKNOWLEDGEMENT 


This  research  was  supported  in  part  by  the  Air  Force 
Office  of  Scientific  Research,  Directorate  of  Electronic  and 
Material  Sciences  (Task  2305R5)  and  Wright  Research 
and  Development  Center  through  the  Miami  Valley 
Research  Institute,  Center  for  Artificial  Intelligence 
Applications  (Air  Force  Contract  F33615-87-C-1550). 


REFERENCES 

[1] J. Serra, Image Analysis and Mathematical
Morphology, Academic Press, London, 1982.

[2] L. A. Tamburino, M. M. Rizki, "Automatic generation
of binary features", IEEE Aerospace and Electronics
Magazine, September 1989.

[3]  L.  A.  Tamburino,  M.  M.  Rizki  and  M.  A.  Zmuda, 
"Computational  resource  management  in  supervised 
learning  systems",  NAECON,  Vol.  3,  1989. 




